AITopics | continual diffusion

Collaborating Authors

continual diffusion

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters

Smith, James Seale, Hsu, Yen-Chang, Kira, Zsolt, Shen, Yilin, Jin, Hongxia

arXiv.org Artificial IntelligenceNov-30-2023

Recent work has demonstrated a remarkable ability to customize text-to-image diffusion models to multiple, fine-grained concepts in a sequential (i.e., continual) manner while only providing a few example images for each concept. This setting is known as continual diffusion. Here, we ask the question: Can we scale these methods to longer concept sequences without forgetting? Although prior work mitigates the forgetting of previously learned concepts, we show that its capacity to learn new tasks reaches saturation over longer sequences. We address this challenge by introducing a novel method, STack-And-Mask INcremental Adapters (STAMINA), which is composed of low-ranked attention-masked adapters and customized MLP tokens. STAMINA is designed to enhance the robust fine-tuning properties of LoRA for sequential concept learning via learnable hard-attention masks parameterized with low rank MLPs, enabling precise, scalable learning via sparse adaptation. Notably, all introduced trainable parameters can be folded back into the model after training, inducing no additional inference parameter costs. We show that STAMINA outperforms the prior SOTA for the setting of text-to-image continual customization on a 50-concept benchmark composed of landmarks and human faces, with no stored replay data. Additionally, we extended our method to the setting of continual learning for image classification, demonstrating that our gains also translate to state-of-the-art performance in this standard benchmark.

continual diffusion, continual learning, diffusion, (13 more...)

arXiv.org Artificial Intelligence

2311.18763

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA

Smith, James Seale, Hsu, Yen-Chang, Zhang, Lingyu, Hua, Ting, Kira, Zsolt, Shen, Yilin, Jin, Hongxia

arXiv.org Artificial IntelligenceApr-12-2023

Recent works demonstrate a remarkable ability to customize text-to-image diffusion models while only providing a few example images. What happens if you try to customize such models using multiple, fine-grained concepts in a sequential (i.e., continual) manner? In our work, we show that recent state-of-the-art customization of text-to-image models suffer from catastrophic forgetting when new concepts arrive sequentially. Specifically, when adding a new concept, the ability to generate high quality images of past, similar concepts degrade. To circumvent this forgetting, we propose a new method, C-LoRA, composed of a continually self-regularized low-rank adaptation in cross attention layers of the popular Stable Diffusion model. Furthermore, we use customization prompts which do not include the word of the customized object (i.e., "person" for a human face dataset) and are initialized as completely random embeddings. Importantly, our method induces only marginal additional parameter costs and requires no storage of user data for replay. We show that C-LoRA not only outperforms several baselines for our proposed setting of text-to-image continual customization, which we refer to as Continual Diffusion, but that we achieve a new state-of-the-art in the well-established rehearsal-free continual learning setting for image classification. The high achieving performance of C-LoRA in two separate domains positions it as a compelling solution for a wide range of applications, and we believe it has significant potential for practical impact.

artificial intelligence, diffusion, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2304.06027

Country:

North America > United States (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre:

Research Report > Promising Solution (0.46)
Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback